library(knitr)
library(downloader)
library(tidyverse)
library(USAboundaries)
library(USAboundariesData)
library(ggsflabel)
library(remotes)
library(devtools)
library(ggplot2)
library(maps)
library(gganimate)
library(gifski)
library(corrplot)
library(tidyquant)
library(timetk)
library(dygraphs)
library(sf)
library(leaflet)
library(plotly)
library(DT)
For my final project, I chose to examine county-level election data from the 2020 US presidential general election, in which Democratic challenger Joe Biden defeated incumbent President Donald Trump. The dataset I used was the “US Election 2020” dataset from Kaggle, which contains .csv files for the presidential, gubernatorial, and Senate elections held in November 2020. I used the file containing the county-level data for the presidential election, which has the following variables: state (the state where the data was collected), county (the county where the data was collected), candidate (the candidate for whom the votes were cast), party (the candidate’s party), total_votes (the vote tally), and won (TRUE if the candidate won that county’s popular vote, FALSE otherwise).
For this project, I wanted to focus specifically on data from the state of Missouri, my new state of residence and the home of William Jewell College. Missouri had long been considered a textbook bellwether and battleground state: it was carried by the winner of every presidential election from 1904 to 2004, with 1956 as the only exception. However, beginning with the 2000 election, this status began to come into question. Missouri has been carried by the Republican nominee in every presidential election since 2000, and the GOP margin of victory has generally grown over that period. After a double-digit Republican margin of victory in 2016, most experts began to classify Missouri as a “solidly red” state. For my analysis, I wanted to investigate how “red” Missouri really is, and whether county-level data can tell us anything about how the state votes in comparison to national trends. My questions were as follows:
As previously mentioned, I used the “US Election 2020” county-level dataset, along with data retrieved through the “tidycensus” package, which provides up-to-date demographic and geographic information directly from the US Census Bureau. For this, I had to obtain a personal API access key.
election_2020 <- read_csv("president_county_candidate.csv")
head(election_2020)
## # A tibble: 6 × 6
## state county candidate party total_votes won
## <chr> <chr> <chr> <chr> <dbl> <lgl>
## 1 Delaware Kent County Joe Biden DEM 44552 TRUE
## 2 Delaware Kent County Donald Trump REP 41009 FALSE
## 3 Delaware Kent County Jo Jorgensen LIB 1044 FALSE
## 4 Delaware Kent County Howie Hawkins GRN 420 FALSE
## 5 Delaware New Castle County Joe Biden DEM 195034 TRUE
## 6 Delaware New Castle County Donald Trump REP 88364 FALSE
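The `won` flag can be cross-checked against `total_votes`. The sketch below is a minimal base R check, assuming that `won` simply marks the candidate with the most votes in each county. It retypes the six rows printed above into a small data frame; `sample_rows`, `county_max`, and `won_check` are illustrative names, not part of the original analysis.

```r
# The six rows printed by head(election_2020), retyped by hand.
# Only the first two New Castle County rows are visible in that output.
sample_rows <- data.frame(
  county = c(rep("Kent County", 4), rep("New Castle County", 2)),
  candidate = c("Joe Biden", "Donald Trump", "Jo Jorgensen", "Howie Hawkins",
                "Joe Biden", "Donald Trump"),
  total_votes = c(44552, 41009, 1044, 420, 195034, 88364),
  won = c(TRUE, FALSE, FALSE, FALSE, TRUE, FALSE)
)

# Assumption: the winner is the candidate with the most votes in the county.
# ave() returns the per-county maximum, repeated on every row of that county.
county_max <- ave(sample_rows$total_votes, sample_rows$county, FUN = max)
sample_rows$won_check <- sample_rows$total_votes == county_max

# TRUE when the recomputed flag agrees with the published one on every row
all(sample_rows$won_check == sample_rows$won)
```

The same comparison could be run on the full dataset to confirm that every county has exactly one winner before filtering on `won`.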
#install.packages("tidycensus")
library(tidycensus)
# Read the key from an environment variable rather than hard-coding it in the report
census_api_key(Sys.getenv("CENSUS_API_KEY"))
vars <- load_variables(2020, "acs5")
#View(vars)
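mo2020 <- election_2020 %>%
  filter(state == "Missouri", candidate %in% c("Joe Biden", "Donald Trump"))
# One row per county: the candidate who carried it
mowinners <- mo2020 %>% filter(won)
# `mo` is never defined in the code shown. Assumed reconstruction, based on the
# output below: an ACS pull of B29001_001 with county geometry (year is a guess).
mo <- get_acs(geography = "county", state = "MO", variables = "B29001_001",
              year = 2020, geometry = TRUE)
# Keep only the county name (drop everything after the comma)
mo2 <- rename(mo, county = NAME)
mo2 <- mutate(mo2, county = sub(",.*$", "", county))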
head(mo2)
## Simple feature collection with 6 features and 5 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -94.6323 ymin: 36.62738 xmax: -89.31327 ymax: 40.57191
## Geodetic CRS: NAD83
## GEOID county variable estimate moe
## 1 29201 Scott County B29001_001 29123 89
## 2 29227 Worth County B29001_001 1581 26
## 3 29207 Stoddard County B29001_001 22813 63
## 4 29035 Carter County B29001_001 4604 127
## 5 29157 Perry County B29001_001 14536 119
## 6 29031 Cape Girardeau County B29001_001 60603 253
## geometry
## 1 MULTIPOLYGON (((-89.78682 3...
## 2 MULTIPOLYGON (((-94.63203 4...
## 3 MULTIPOLYGON (((-90.25976 3...
## 4 MULTIPOLYGON (((-91.22447 3...
## 5 MULTIPOLYGON (((-90.14678 3...
## 6 MULTIPOLYGON (((-89.8668 37...
mo2 <- merge(mo2, mowinners, by = "county")
head(mo2)
## Simple feature collection with 6 features and 10 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -95.7747 ymin: 36.49872 xmax: -91.40903 ymax: 40.58521
## Geodetic CRS: NAD83
## county GEOID variable estimate moe state candidate party
## 1 Adair County 29001 B29001_001 19819 253 Missouri Donald Trump REP
## 2 Andrew County 29003 B29001_001 13478 33 Missouri Donald Trump REP
## 3 Atchison County 29005 B29001_001 4148 23 Missouri Donald Trump REP
## 4 Audrain County 29007 B29001_001 19340 92 Missouri Donald Trump REP
## 5 Barry County 29009 B29001_001 26792 181 Missouri Donald Trump REP
## 6 Barton County 29011 B29001_001 8889 80 Missouri Donald Trump REP
## total_votes won geometry
## 1 6413 TRUE MULTIPOLYGON (((-92.85637 4...
## 2 7255 TRUE MULTIPOLYGON (((-95.06028 4...
## 3 2199 TRUE MULTIPOLYGON (((-95.77355 4...
## 4 7732 TRUE MULTIPOLYGON (((-92.31384 3...
## 5 12425 TRUE MULTIPOLYGON (((-94.07711 3...
## 6 5168 TRUE MULTIPOLYGON (((-94.61763 3...
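mo_map <- mo2 %>%
  ggplot(aes(fill = candidate)) +
  geom_sf(color = "white") +
  # EPSG:26915 is NAD83 / UTM zone 15N, the zone covering most of Missouri
  # (26911 is zone 11N, which lies in the western US)
  coord_sf(crs = 26915) +
  labs(title = "2020 Presidential Election Results By County - MISSOURI") +
  theme_void() +
  theme(legend.position = "right")
mo_map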
Looking at the votes cast, one can see that President Trump won Missouri with about 57% of the state’s popular vote, compared to Joe Biden’s 41%. Third-party candidates and write-ins collectively received less than 2% of the vote. Trump’s margin of victory of roughly 15 percentage points, while large, is far smaller than his 97% share of counties won. This is well in line with national trends: Republicans tend to be far more dominant in rural areas, so they win a large majority of the counties in a primarily rural state, even though those counties tend to be much less populous than the Democratic-leaning counties that contain urbanized areas and major population centers. Many of Missouri’s counties have fewer than 20,000 registered voters, and it is here that Republican margins of victory tend to be most substantial. By contrast, the counties won by Democrats contain all or part of most of Missouri’s largest cities: Kansas City, Columbia, and St. Louis. All in all, Donald Trump won the statewide popular vote comfortably, with 1,718,736 votes to Biden’s 1,253,014.
vote_count <- election_2020 %>%
  filter(state == "Missouri") %>%
  group_by(candidate) %>%
  summarise(votes = sum(total_votes)) %>%
  mutate(share = paste0(round(votes / sum(votes) * 100, 2), "%"))
head(vote_count)
## # A tibble: 6 × 3
## candidate votes share
## <chr> <dbl> <chr>
## 1 Don Blankenship 3919 0.13%
## 2 Donald Trump 1718736 56.8%
## 3 Howie Hawkins 8283 0.27%
## 4 Jo Jorgensen 41205 1.36%
## 5 Joe Biden 1253014 41.41%
## 6 Write-ins 805 0.03%
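The percentage-point margin can be recomputed from the totals in the table above. This is a small arithmetic sketch in base R; the vector `votes` and the helper variables are illustrative names, and the numbers are copied by hand from the printed output.

```r
# Vote totals copied from the vote_count table above
votes <- c(
  "Don Blankenship" = 3919,
  "Donald Trump"    = 1718736,
  "Howie Hawkins"   = 8283,
  "Jo Jorgensen"    = 41205,
  "Joe Biden"       = 1253014,
  "Write-ins"       = 805
)

total_votes <- sum(votes)
share_pct <- votes / total_votes * 100

# Margin in raw votes and in percentage points of all votes cast
margin_votes <- votes[["Donald Trump"]] - votes[["Joe Biden"]]
margin_pts <- share_pct[["Donald Trump"]] - share_pct[["Joe Biden"]]

# Combined share of everyone other than the two major-party candidates
is_major <- names(share_pct) %in% c("Donald Trump", "Joe Biden")
other_pct <- sum(share_pct[!is_major])

round(c(margin_pts = margin_pts, other_pct = other_pct), 2)
```

With these totals the margin comes out to a little over 15 percentage points, and the minor candidates and write-ins together stay under 2%.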
candidate_share <- ggplot(vote_count, aes(x = candidate, y = votes, fill = candidate)) +
  geom_col() +
  geom_text(aes(label = share), size = 5.75) +
  labs(title = "Vote Totals and Percentage Share By Candidate In Missouri",
       x = "Candidate", y = "Total Votes") +
  theme(axis.text.x = element_blank(),
        axis.ticks.x = element_blank())
candidate_share
In the interactive graph below, counties of similar size are grouped together and arranged in descending order of the Republican margin of victory in each county. The dataset is filtered to exclude the four Missouri counties carried by Joe Biden. By raw numbers, Donald Trump had his largest popular-vote margin in Jefferson County: 39,523 votes, on 77,046 total votes cast for him. However, by share of the votes cast, Trump performed better in many much smaller, more rural counties (e.g., Knox County). Grouping counties of similar size together lets us discern where Republicans performed best in terms of the type of county and its demographic makeup. Overall, it is safe to say that smaller, more rural counties had the highest margins of victory for the GOP by vote share. However, it would be remiss not to point out the more populous counties where Republicans performed well: St. Charles County, Greene County, and Jefferson County each had over 75,000 votes cast for Trump and delivered sizable margins of victory. In Clay County, our home and the home of William Jewell College, Trump won narrowly with over 64,000 votes, even though Clay County is considered one of Missouri’s more liberal counties. In short, among the larger counties, Republicans fared better in those that contain more suburbs and are an urban/rural mix rather than predominantly urban.
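# Counties carried by Trump
rmargin <- mo2020 %>%
  filter(party == "REP", won)
# Biden's votes in the counties he lost, keyed by county for the join
dmargin <- mo2020 %>%
  filter(party == "DEM", !won) %>%
  select(county, dem_votes = total_votes)
# Join by county instead of relying on row order, and keep the margin numeric
# (paste0() turned it into text, which sorts and plots incorrectly)
r_wonby <- rmargin %>%
  inner_join(dmargin, by = "county") %>%
  mutate(repub_margin_of_vic = total_votes - dem_votes)
rwonby2 <- r_wonby %>%
  arrange(desc(repub_margin_of_vic))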
interactiv_p <- ggplot(data=rwonby2, mapping=aes(total_votes, repub_margin_of_vic, color=county)) +
geom_point() +
labs(y = "Margin of Victory (# of Votes)",
x = "Total GOP Votes",
title = "Which Missouri counties had the highest GOP margin of victory?") +
theme(legend.position = "none", panel.grid.major = element_blank(), panel.grid.minor = element_blank(), panel.background = element_blank(), axis.line = element_line(colour = "black")) +
theme(axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank())
ggplotly(interactiv_p)
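The difference between a raw-vote margin and a share-based margin can be sketched as follows. The Jefferson County figures come from the paragraph above (77,046 Trump votes and a 39,523-vote margin imply 37,523 Biden votes). The second row is a made-up small county used purely for illustration, and the column names are hypothetical. The share is computed over the two major-party votes only, which is one assumption about how it might be defined.

```r
counties <- data.frame(
  county = c("Jefferson County", "Hypothetical Small County"),
  rep_votes = c(77046, 1500),  # second value is invented for illustration
  dem_votes = c(37523, 300)    # 77046 - 39523 = 37523; 300 is invented
)

# Raw margin: how many more votes the Republican received
counties$margin_votes <- counties$rep_votes - counties$dem_votes

# Share-based margin: raw margin as a percentage of the two-party vote
two_party <- counties$rep_votes + counties$dem_votes
counties$margin_pct <- round(counties$margin_votes / two_party * 100, 1)

# Ranking by each measure picks a different county first
top_by_votes <- counties$county[which.max(counties$margin_votes)]
top_by_share <- counties$county[which.max(counties$margin_pct)]
c(top_by_votes = top_by_votes, top_by_share = top_by_share)
```

A large county can lead on raw votes while a small one leads on share, which is why the two measures are worth reading side by side.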
All in all, the data and visualizations in this report yielded interesting findings and provided adequate answers to my data-driven questions. As for the state’s reflection of national trends, we can conclude the following:

- Missouri’s election results are in line with national trends, but are perhaps even more pronounced in terms of the urban/rural divide and the expected results in more populous counties.
- The counties won by Democratic nominee Joe Biden were all characteristic of Democratic-voting counties in otherwise “Deep Red” states across the country (urban and more densely populated).
Given the massive difference in counties won, a double-digit popular vote margin, and across-the-board dominance by the GOP, it is fair to classify Missouri as a “Solidly Republican” state. Furthermore, if the trends of the past two decades continue, it is not implausible for Missouri to reach “Deep Red” status in the next two election cycles, joining the 15 most conservative states in the country.